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1.  INTRODUCTION 


Random  outcomes  can  often  produce  significant  effects  on  planning  de¬ 
cisions  that  consider  several  time  periods.  Multistage  stochastic  programs 
can  model  these  decisions  but  implementations  are  generally  restricted  to  a 
limited  number  of  scenarios  in  each  period.  We  present  an  alternative  ap¬ 
proximation  scheme  that  can  obtain  lower  and  upper  bounds  on  the  optimal 
objective  value  in  these  stochastic  programs. 

The  method  is  based  on  building  response  functions  to  future  outcomes 
that  depend  separably  on  the  variation  of  random  parameters  around  the 
limited  set  of  scenarios  that  is  initially  provided.  For  stochastic  linear 
programs,  the  resulting  optimization  problem  involves  an  objective  with 
a  limited  number  of  nonlinear  terms  subject  to  linear  constraints.  The 
method  can  be  incorporated  into  various  alternative  procedures  for  solving  1 
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2  J.R.  BIRGE 

multistage  stochastic  linear  programs  with  finite  numbers  of  scenarios. 

:  i-  — In  thi6-Rext  section,  wo-,  discuss-  the  basic  model  and  alternative  ap¬ 
proaches.  -We  thon  describe  the  basic  properties  of  piecewise  linear  response 
functions.  The  fourth  section  presents  a  basic  model  for  a  single  scenario 
and  randomness  restricted  to  constraint  levels.  The  fifth  section  extends 
this  to  multiple  scenarios  with  varying  scenario  ranges  and  to  possibilities 


for  randomness  among  the  constraint  vectors. 
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2.  THE  BASIC  MODEL  AND  APPROACHES 

The  multistage  stochastic  programming  model  has  been  applied  to  re¬ 
source  planning  in  a  variety  of  contexts  (see,  for  example,  the  references 
in  Ermoliev  and  Wets  (1988)).  The  general  multistage  stochastic  program¬ 
ming  problem  is: 

T 

min  eE  ft(xt)]  s.  t.  gi(xi)  =  0,gt(xt_i,xt)  =  0,  a.  s.,t  =  1 . T, 

t=i 

(1) 

and  x  nonanticipative.  This  last  constraint  restricts  xt  to  depend  only  on 
the  outcomes  of  any  random  variables  up  to  time  t. 

In  the  following,  we  restrict  the  problem  in  ( 1)  to  the  case  of  linear  /  and 
g  and  only  allow  randomness  to  enter  the  right-hand  sides  of  the  constraints. 
In  this  case,  the  problem  becomes: 


min  cfxi  +  £'{c2X2-|-  ...  +  cjxt] 


subject  to  AjXi 

—  t>o, 

(2) 

Bt- iXf_i 

+.4(X(  =  £t>  t  =  2, . 

...T, 

H 

IV 

o 

t  =  1 . T, 

where  the  sizes  of  the  matrices  are 

consistent  with  xt  6  9?n‘,< 

=  1... 

..r. 

and  6i  €  Rm',and  the  random  vectors,  £<  £  3im'.  Note  that  this  implies 
that  the  decisions  xt  in  future  periods  are  random  and  depend  on  the  past 
outcomes. 

Many  possibilities  have  been  suggested  for  solving  (2).  We  describe  these 
briefly  below. 
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One  approach  to  (2)  is  to  assume  that  the  data  are  Markovian  and  to  fit 
the  stochastic  program  into  the  framework  of  Markov  decision  processes. 
Grinold  showed  how  this  could  be  done  in  (1976).  For  computation,  it  would 
generally  be  necessary  to  restrict  the  outcomes  from  one  period  to  some 
finite  set.  This  may  not,  however,  be  possible  in  general  and  introduces 
error  that  may  not  be  easily  quantifiable. 

A  similar  approach  is  to  fit  the  stochastic  program  into  the  framework  of 
a  dynamic  program  and  to  approximate  the  value  function  using  suitable 
functions.  The  value  function  is  defined  recursively  using: 

Vr_i(xT-i)  =  E  [min  fr(xT)  subject  to  gT(xT-i.XT)  =  0], 


and  the  recursion: 


U,(xt)  =  E  [min  ft+1(xt+1)  +  Vt+1(xt+i)  subject  to  gt+1(xt,xt+1)  =  0], 


This  approximation  appears  in  Beale,  Dantzig,  Watson  (1986),  by  as¬ 
suming  a  suitable  functional  for  Vt  (for  example,  a  quadratic  function)  and 
then  fitting  this  function  using  the  mean  value  and  the  mean  and  standard 
deviations  of  the  random  variables.  This  has  an  especially  convenient  form 
for  production  problems.  The  piecewise  linear  approximation  discussed  in 
this  paper  follows  many  of  the  basic  ideas  in  this  approach. 

The  use  of  scenarios  is  also  common  in  implementations.  In  this  case, 
a  scenario  corresponds  to  a  specific  realization  of  £2.  •  ,£r-  The  scenario 

approach  uses  a  limited  number  of  these  realizations  and  combines  the  so¬ 
lutions  in  some  suitable  way  to  obtain  a  solution  to  the  original  problem. 
Rockafellar  and  Wets  (1988)  show  how  this  can  be  implemented  in  a  proce¬ 
dure  that  converges  to  the  solution  of  the  stochastic  program  (1)  in  which 
all  of  the  scenarios  are  included. 

Implicit  in  this  approach  and  other  procedures  with  finite  numbers  of 
scenarios  is  some  discretization  and  aggregation  of  random  variables.  The 
basic  idea  behind  this  approach  is  to  combine  severed  periods  and  scenarios 
(realizations  of  the  random  variable)  together  into  an  aggregated  problem. 
For  example,  the  last  T  —  k  +  1  periods  may  be  replaced  by  an  objective, 
/*(xk)  =  EtL  k**ft(Ttxk),  and  the  constraints  become  y*(xk.Tixk)  = 
Et=k  atgt(Ttxk,  Tt+ixk  ),  where  the  T)  are  appropriate  shift  operators  to 
have  xk  act  on  the  correct  underlying  stochastic  element. 
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Lower  bounds  are  available  in  this  approach  by  using  expected  values 
when  the  functions  /  are  convex  and  the  constraints  are  linear.  Upper 
bounds  are  more  difficult  to  achieve.  For  linear  /,  Birge  ( 1985a)  develops 
upper  and  lower  bounds  that  assume  bounded  x  and  linear  penalties  for 
violations  of  the  constraints.  They  are  based  on  constructing  feasible  primal 
and  dual  solutions.  The  assumptions  can  be  relaxed  for  problems  with 
special  structure,  such  as  production  problems  (see  Birge, 1985b).  It  may 
also  be  possible  to  develop  bounds  on  the  primal  and  dual  variables  using 
the  sublinear  approximation  procedure. 

The  piecewise  linear  approximation  presented  here  is  an  extension  of  this 
approach  in  which  we  find  functions  that  approximate  (and  bound)  response 
to  variations  about  the  random  parameter.  The  next  section  describes  the 
basic  principles  behind  this  approach. 


3.  PIECEWISE  LINEAR  APPROXIMATIONS 


The  difficulty  with  discretization  approaches  is  that  many  functions  eval¬ 
uations  are  required  to  obtain  bounds  on  the  optimal  values  (see  Birge  and 
Wets  (1986a)).  The  piecewise  linear  response  approach  is  an  alternative 
that  can  reduce  work  considerably.  The  basis  for  this  is  the  ray  approxi¬ 
mation  procedure  in  Birge  and  Wets  (1986a,b).  This  uses  the  sublinearity 
property  of  the  recourse  function  to  obtain  a  separable  function  that  ma¬ 
jorizes  Vt,  the  value  function  at  stage  t.  This  approach  is  generalized  in 
Birge  and  Wets  (1987,  1989).  Wallace  (1987),  on  the  other  hand,  formu¬ 
lated  a  procedure  that  applies  to  problems  in  which  the  recourse  function 
involves  the  solution  of  a  network  problem.  The  piecewise  linear  procedure 
in  Birge  and  Wallace  (1988)  is  a  combination  and  generalization  of  these 
two  basic  approaches  for  two-stage  problems  (T  =2). 

The  two-stage  algorithm  provides  a  separable  piecewise  linear  function 
that  can  bound  Vj  throughout  the  support  of  the  random  variables  and  can 
be  easily  evaluated.  This  procedure  can  be  extended  to  multiple  stages  as 
we  show  in  the  next  section.  We  first  show  the  results  for  one  period. 


To  simplify  notation  and  to  establish  general  results,  we  consider  the 
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following  system  : 


A1x  =  61+£,  A2x  =  b2,  0  <  x  <  c  +  <t>  (3) 

where  A\  £  3tm‘xr*.  A2  €  (Ai|A.2)t  =  A  is  the  coefficient 

matrix,  (&i|62)r  =  6  is  the  fixed  part  of  the  right-hand  side,  c  is  the  fixed 
part  of  the  bounds  on  the  variables,  £  is  the  random  availability  of  resources 
and  </>  is  the  random  part  of  the  variable  capacities,  where  4>  >  0.  We  assume 
that  there  is  a  positive  probability  that  6  =  0.  Next  define  Q(£,d>)  by 

Q(£,<p)  =  min{?rx|(3)}.  (4) 

Finally  define  \(£,0,d_,d+)  as  the  set  of  x-vectors  satisfying 

Aix  =  A2x  =  0,  d~  <  x  <  <j>  +  d+ .  (5) 


Our  goal  is  to  find  an  upper  bound  on  Q(£,4>),  or,  more  precisely,  on 
EQ(£,  4>) ■  We  do  this  by  finding  a  separable  piecewise  linear  function  U(£,<j>) 
defined  by 


U(S,<j>)  =  Q(i,0)  +  H(t)  + 


qTxl+(Zi-Z,) 

qT*’~(Zi  -S>) 


if  Si>Si, 
if  £  <  Si 


where  =  E£,-,  and  H{4>)  is  a  piecewise  linear  function  in  < i>.  The  last  term 
can  also  be  replaced  by  a  general  piecewise  linear  function,  /(£,  —  £,).  to 
obtain  sharper  bounds. 


4.  MULTISTAGE  APPROXIMATIONS 


The  basic  idea  is  to  use  the  two-stage  method  repeatedly  to  approximate 
the  objective  function  by  separable  functions.  For  linear  problems,  this 
leads  to  sublinear  or  piecewise  linear  functions  as  in  the  previous  section. 
Functions  without  recession  directions  (e.g.,  quadratic  functions)  would  re¬ 
quire  some  type  of  nonlinear  (  e  g.,  quadratic)  function  that  should  again 
be  easily  integrable,  requiring,  for  example,  limited  moment  information 
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(second  moments  for  quadratic  functions).  We  consider  the  linear  case 
below. 

The  goal  is  to  construct  a  problem  that  is  separable  in  the  components 
of  the  random  vector.  Let  rjt  be  the  random  right-hand  side  that  appears 
in  period  t.  We  apply  the  two-stage  approach  to  this  problem  and  then 
use  the  resulting  solution  as  an  input  to  the  next  stage  of  the  problem. 
This  random  vector  77  depends  on  the  first-stage  decision.  Its  distribution 
is  found  directly  (as,  e.g.,  a  piecewise  normal  distribution  if  the  original 
random  elements  all  have  normal  distributions),  or  it  can  be  approximated 
using  moment  information.  In  this  case,  the  distribution  that  solves  the 
appropriate  moment  problem  can  be  used  to  provide  a  bound. 

The  problem  is  then  to  find  the  distribution  of  the  77,  and  the  form  of 
the  piecewise  linear  functions  rl>\  for  each  component  i  of  77.  The  resulting 
problem  to  solve  is: 


min  c 1X1  + 


subject  to  AiXj  =  bi,xj  >  0.  ( SL ) 

To  find  77  and  rp  recursively,  assume  that  we  know  the  distribution  of  ijt. 
(Initially,  this  is  rj2  =  £2  )  The  piecewise  linear  functions  are  formed 

by  solving  the  parametric  linear  programs  in  a: 

min  c(x,  subject  to  Atxt  =  ±ae(i),xt  >  /?t)  (sub(t  -  i)) 

to  obtain  xf'  that  depends  on  the  parameter  a  and  the  lower  bound  /?t, 
which  is  generally  zero  but  may  have  negative  components  if  certain  random 
variables  are  bounded. 

If  we  restrict  ourselves  to  two  pieces,  the  function  is  sublinear,  so  that 

<(0)  =  (c«*;,',ir„(o>o  +  i»j«c*)<0)(i^< (*)!)•  We  then  have  that 

nt+i(j)  is  given  by: 


m# 

it+iij)  =  s.+i  o')  -  Bt{j,  .)[^(xt'int(i)>0  +  *r‘i.j,(o<o)(M0i)]- 

1=1 


(6) 


where  we  must  find  the  resulting  distribution.  Note  that  the  values  in  (6) 
are  linear  functions  of  77,  on  the  regions  where  rjt  has  constant  sign.  We 
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can,  therefore,  construct  Tjt+i  as  a  function  of  rjt  by  overlaying  these  linear 
transformations  of  random  variables.  For  normally  distributed  data,  this 
may  be  possible  since  the  transformation  will  not  affect  the  distribution 
class. 


5.  EXTENSIONS 


The  approximation  given  above  may  be  difficult  to  compute  if  normal 
distributions  are  not  present.  Instead  we  may  approximate  the  distribution 
of  r)t+i  .  This  requires  bounds  on  the  prob.{r)t(i)  >  0}  and  on  the  moments 
conditional  on  rj((z)  >  or  <  0.  Given  these  values,  moment  problems  can 
be  solved  to  calculate  corresponding  values  for  rjt+i  and  to  bound  ip\.  Any 
other  bounds  on  the  input  from  period  t  to  period  t  +  l  {Btxt)  can  also 
be  used  to  obtain  bounds  on  the  ip  values  so  that  crude  bounds  can  be 
obtained.  This  may  be  possible  in  special  cases. 

Note  also  that  certain  problems,  such  as  networks,  may  have  very  few 
variables  present  in  the  Bt(i,  )  term  in  (6)  (because  they  have  a  close-to- 
simple  recourse  structure),  so  that  r)t+l  may  be  easily  calculable  for  these 
problems.  These  special  cases  require  further  investigation. 

Another  looser  but  more  implementable  bound  can  be  obtained  by  forcing 
a  feasible  and  separable  response  in  all  future  periods  depending  on  a  single 
random  variable  in  the  current  period.  This  eliminates  the  problem  of 
characterizing  the  distribution  of  inputs  to  all  periods.  It  does,  however, 
force  a  dependency  in  future  periods  that  may  increase  the  bound. 

To  develop  this  response  function,  let  Xt(±i)  be  an  optimal  solution, 
it, ,  xt ,  to  : 


min 

cjxt  +... 

+  cfxr 

subject  to 

A,x( 

=  ±G 

Btx(  +At+iX(+i 

=  0, 

At  xt  -  0. 

xT  >  0,  T  =  t. 

. . . ,  T. 

Now  define  <p‘ 

■%(«■))  =  E{CfXt(±i)«t(i) 

1  -G('))±},  where  C,  = 

,ct )•  An  upper  bound  on  the  objective  value  of  (1)  is  obtained  by  solving 
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the  separable  nonlinear  program: 


min 

subject  to 


cfx  i  ...+c£xt 

<4i*i 

BtXt  +At+iit+i 


+E«£i(*‘-i(e.(0) 


-6+t 


1, 


x«  >  o,  t-  i _ r. 

*€S. 


(8) 

where  x,  is  the  support  set  of  the  random  variables.  Note  that  if  we  drop 
the  nonlinear  term  in  the  objective,  and  replace  £  in  the  constraints  with 
a  fixed  valued  of  E[£],  then  we  can  obtain  a  lower  bound  on  the  optimal 
objective  value  in  (1).  The  upper  bound  from  (8)  represents  allowing  some 
variation  of  the  choice  of  the  centering  point  £  so  that  a  best  approximation 
is  found.  The  value  of  (8)  is  an  upper  bound  because  the  composition  of 
the  xt  solutions  from  (8)  and  the  Xt  values  used  in  the  <p  terms  yield  a 
feasible  solution  for  all  £. 


This  procedure  nvy  also  be  implemented  as  response  to  several  scenarios. 
In  this  case,  the  random  vectors  are  partitioned  and  the  response  functions 
are  evaluated  over  the  partitions.  The  partitions  may  also  be  part  of  the 
higher  level  optimization  problem  so  that  in  some  way  a  ‘best”  partition 
may  be  found. 


The  points  used  within  the  partitions  may  be  chosen  as  expected  values, 
in  which  case,  the  solution  without  penalty  terms  is  again  a  lower  bound 
"ii  tin'  optimal  iihjivl  ivr  \  a  1 1 1<  -  For  an  upper  l-'onnd.  this  vertor  may  be 
allowed  to  take  on  any  value  in  the  partition. 

The  use  of  multiple  scenarios  enters  directly  into  the  scenario  aggrega¬ 
tion  approach  of  Rockafellar  and  Wets.  This  can  be  used  to  solve  the  top 
level  problem  and  to  approach  a  solution  that  is  optimal  for  a  given  set  of 
partitions  and  the  piecewise  linear  penalty  structure  presented  here. 

Computations  are  then  restricted  to  optimizing  separable  nonlinear  func¬ 
tions  subject  to  linear  constraints.  Implementations  can  be  based  on  pre¬ 
vious  procedures  (such  as  decomposition).  The  value  of  the  method  can, 
however,  only  be  demonstrated  by  solving  practical  problems  and  observing 
the  relative  errors.  Implementation  is  underway  to  achieve  this  goal. 
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